Enable fp8e5m2fnuz type #3570

Merged
merged 17 commits into develop from enable_fp8e5m2fnuz_type on Nov 29, 2024

Conversation

CharlieL7
Collaborator

  • Enables the E5M2 FNUZ datatype so that we have full support for all the current FP8 types
  • E5M2 FNUZ will not be used directly in models, but may be useful when converting from OCP -> FNUZ (see the format sketch after this list)
  • Lots of updated files, but mostly expanded test cases
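
For context, here is a minimal sketch of how an fp8e5m2fnuz byte decodes (hedged: this illustrates the format itself, not MIGraphX's implementation): 1 sign bit, 5 exponent bits, 2 mantissa bits, exponent bias 16 (one higher than OCP e5m2's bias of 15), no infinity encodings, and the lone bit pattern 0x80 (what would be negative zero) reserved for NaN.

```cpp
// Illustrative decoder for fp8e5m2fnuz (sign=1, exp=5, mantissa=2, bias=16).
// FNUZ: no infinities; the single bit pattern 0x80 encodes NaN.
#include <cmath>
#include <cstdint>
#include <cstdio>

float fp8e5m2fnuz_to_float(uint8_t x)
{
    if(x == 0x80)                        // the only NaN; -0 does not exist
        return std::nanf("");
    const float sign   = (x & 0x80) ? -1.0f : 1.0f;
    const int exponent = (x >> 2) & 0x1f; // 5 exponent bits
    const int mantissa = x & 0x3;         // 2 mantissa bits
    const int bias     = 16;              // OCP e5m2 uses 15
    if(exponent == 0)                     // subnormal range
        return sign * std::ldexp(mantissa / 4.0f, 1 - bias);
    return sign * std::ldexp(1.0f + mantissa / 4.0f, exponent - bias);
}

int main()
{
    std::printf("%g\n", fp8e5m2fnuz_to_float(0x40)); // 1.0
    std::printf("%g\n", fp8e5m2fnuz_to_float(0x7f)); // max finite = 57344
}
```

The one-higher exponent bias is the main thing to account for when converting OCP e5m2 bit patterns to the FNUZ variant.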

@CharlieL7 CharlieL7 added the FP8 (issues related to FP8 implementation) label Oct 29, 2024
@CharlieL7 CharlieL7 self-assigned this Oct 29, 2024
@CharlieL7 CharlieL7 requested a review from causten as a code owner October 29, 2024 18:33
@@ -55,5 +55,6 @@ template struct test_gemm_add_broadcast2<migraphx::shape::float_type>;
// template struct test_gemm_add_broadcast2<migraphx::shape::half_type>; // fails with CK,
Collaborator

Do we have a CK feature for this or is there a workaround?

@@ -58,3 +58,6 @@ struct test_gemm_add : verify_program<test_gemm_add<DType>>
template struct test_gemm_add<migraphx::shape::float_type>;
template struct test_gemm_add<migraphx::shape::half_type>;
// TODO template struct test_gemm_add<migraphx::shape::fp8e4m3fnuz_type>;
// TODO template struct test_gemm_add<migraphx::shape::fp8e5m2fnuz_type>;
// TODO template struct test_gemm_add<migraphx::shape::fp8e4m3fn_type>;
// TODO template struct test_gemm_add<migraphx::shape::fp8e5m2_type>;
Collaborator

Why are these TODO?

Collaborator Author

These don't work; Umang made an issue about them earlier.

@@ -46,6 +46,7 @@ struct test_gemm_transposea : verify_program<test_gemm_transposea<DType>>
template struct test_gemm_transposea<migraphx::shape::float_type>;
template struct test_gemm_transposea<migraphx::shape::half_type>;
template struct test_gemm_transposea<migraphx::shape::fp8e4m3fnuz_type>;
template struct test_gemm_transposea<migraphx::shape::fp8e5m2fnuz_type>;
// TODO need hipblaslt support
// template struct test_gemm_transposea<migraphx::shape::fp8e4m3fn_type>;
// template struct test_gemm_transposea<migraphx::shape::fp8e5m2_type>;
Collaborator

Any way to work around this and run these tests without hipblaslt?
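
One pattern that could exercise these paths without hipblaslt (a hedged sketch using the internal program-building API seen in existing verify tests; the convert-op usage is assumed from that existing code, not taken from this PR) is to widen the OCP fp8 inputs to float ahead of the dot, so the gemm itself runs on regular float kernels:

```cpp
// Hedged sketch: build a small MIGraphX program that converts OCP fp8
// inputs to float before the gemm, so no fp8 gemm kernel is required.
#include <migraphx/program.hpp>
#include <migraphx/make_op.hpp>
#include <migraphx/shape.hpp>

migraphx::program make_widened_gemm()
{
    migraphx::program p;
    auto* mm = p.get_main_module();
    migraphx::shape s{migraphx::shape::fp8e5m2_type, {4, 4}};
    auto a = mm->add_parameter("a", s);
    auto b = mm->add_parameter("b", s);
    // Widen both operands to float before the dot
    auto fa = mm->add_instruction(
        migraphx::make_op("convert", {{"target_type", migraphx::shape::float_type}}), a);
    auto fb = mm->add_instruction(
        migraphx::make_op("convert", {{"target_type", migraphx::shape::float_type}}), b);
    mm->add_instruction(migraphx::make_op("dot"), fa, fb);
    return p;
}
```

This would only verify the conversion path plus a float gemm, though, not the fp8 gemm kernels themselves, so it would not replace the hipblaslt-backed tests.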

@TedThemistokleous (Collaborator) left a comment

Initial comments. A few questions, but nothing concerning after you fixed the API compatibility.

Just following up on some of your comments about the tests.


codecov bot commented Nov 19, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 92.19%. Comparing base (da545d2) to head (e7f0d4b).
Report is 1 commit behind head on develop.

Additional details and impacted files
@@           Coverage Diff            @@
##           develop    #3570   +/-   ##
========================================
  Coverage    92.19%   92.19%           
========================================
  Files          513      513           
  Lines        21633    21638    +5     
========================================
+ Hits         19945    19950    +5     
  Misses        1688     1688           


@CharlieL7 CharlieL7 marked this pull request as draft November 22, 2024 18:46
@CharlieL7 CharlieL7 marked this pull request as ready for review November 25, 2024 20:19
…form/AMDMIGraphX into enable_fp8e5m2fnuz_type
@migraphx-bot (Collaborator)

| Test | Batch | Rate new (e7f0d4) | Rate old (da545d) | Diff | Compare |
|---|---|---|---|---|---|
| torchvision-resnet50 | 64 | 3,257.68 | 3,255.84 | 0.06% | |
| torchvision-resnet50_fp16 | 64 | 6,998.19 | 6,991.64 | 0.09% | |
| torchvision-densenet121 | 32 | 2,436.13 | 2,431.85 | 0.18% | |
| torchvision-densenet121_fp16 | 32 | 4,093.27 | 4,070.27 | 0.56% | |
| torchvision-inceptionv3 | 32 | 1,627.63 | 1,628.09 | -0.03% | |
| torchvision-inceptionv3_fp16 | 32 | 2,743.18 | 2,747.87 | -0.17% | |
| cadene-inceptionv4 | 16 | 765.37 | 764.69 | 0.09% | |
| cadene-resnext64x4 | 16 | 810.93 | 806.89 | 0.50% | |
| slim-mobilenet | 64 | 7,463.58 | 7,464.92 | -0.02% | |
| slim-nasnetalarge | 64 | 208.46 | 208.40 | 0.03% | |
| slim-resnet50v2 | 64 | 3,440.51 | 3,441.42 | -0.03% | |
| bert-mrpc-onnx | 8 | 1,145.13 | 1,145.08 | 0.00% | |
| bert-mrpc-tf | 1 | 461.72 | 461.72 | 0.00% | |
| pytorch-examples-wlang-gru | 1 | 422.34 | 429.74 | -1.72% | |
| pytorch-examples-wlang-lstm | 1 | 393.67 | 480.37 | -18.05% | 🔴 |
| torchvision-resnet50_1 | 1 | 772.55 | 770.17 | 0.31% | |
| cadene-dpn92_1 | 1 | 416.14 | 403.49 | 3.14% | 🔆 |
| cadene-resnext101_1 | 1 | 382.82 | 381.97 | 0.22% | |
| onnx-taau-downsample | 1 | 346.01 | 346.00 | 0.00% | |
| dlrm-criteoterabyte | 1 | 33.32 | 33.31 | 0.02% | |
| dlrm-criteoterabyte_fp16 | 1 | 52.76 | 52.72 | 0.06% | |
| agentmodel | 1 | 8,450.66 | 8,212.54 | 2.90% | |
| unet_fp16 | 2 | 58.82 | 58.71 | 0.18% | |
| resnet50v1_fp16 | 1 | 942.64 | 938.46 | 0.44% | |
| resnet50v1_int8 | 1 | 1,005.45 | 999.92 | 0.55% | |
| bert_base_cased_fp16 | 64 | 1,170.40 | 1,169.75 | 0.06% | |
| bert_large_uncased_fp16 | 32 | 363.16 | 363.16 | -0.00% | |
| bert_large_fp16 | 1 | 200.44 | 200.11 | 0.16% | |
| distilgpt2_fp16 | 16 | 2,197.87 | 2,198.98 | -0.05% | |
| yolov5s | 1 | 534.90 | 533.15 | 0.33% | |
| tinyllama | 1 | 43.62 | 43.41 | 0.49% | |
| vicuna-fastchat | 1 | 174.11 | 173.49 | 0.36% | |
| whisper-tiny-encoder | 1 | 418.39 | 417.39 | 0.24% | |
| whisper-tiny-decoder | 1 | 424.68 | 427.99 | -0.77% | |

This build is not recommended to merge 🔴

@migraphx-bot (Collaborator)


✅ bert-mrpc-onnx: PASSED: MIGraphX meets tolerance
✅ bert-mrpc-tf: PASSED: MIGraphX meets tolerance
✅ pytorch-examples-wlang-gru: PASSED: MIGraphX meets tolerance
✅ pytorch-examples-wlang-lstm: PASSED: MIGraphX meets tolerance
✅ torchvision-resnet50_1: PASSED: MIGraphX meets tolerance
✅ cadene-dpn92_1: PASSED: MIGraphX meets tolerance
✅ cadene-resnext101_1: PASSED: MIGraphX meets tolerance
✅ dlrm-criteoterabyte: PASSED: MIGraphX meets tolerance
✅ agentmodel: PASSED: MIGraphX meets tolerance
✅ unet: PASSED: MIGraphX meets tolerance
✅ resnet50v1: PASSED: MIGraphX meets tolerance
✅ bert_base_cased_fp16: PASSED: MIGraphX meets tolerance
🔴 bert_large_uncased_fp16: FAILED: MIGraphX is not within tolerance - check verbose output
✅ bert_large: PASSED: MIGraphX meets tolerance
✅ yolov5s: PASSED: MIGraphX meets tolerance
✅ tinyllama: PASSED: MIGraphX meets tolerance
✅ vicuna-fastchat: PASSED: MIGraphX meets tolerance
✅ whisper-tiny-encoder: PASSED: MIGraphX meets tolerance
✅ whisper-tiny-decoder: PASSED: MIGraphX meets tolerance
✅ distilgpt2_fp16: PASSED: MIGraphX meets tolerance

@TedThemistokleous TedThemistokleous added the high priority A PR with high priority for review and merging. label Nov 29, 2024
@TedThemistokleous
Copy link
Collaborator

We should get this in sooner rather than later, @causten. ONNX Runtime has support for this as well from the looks of it, so it would be nice to get this all in for FP8 support at once.

@causten causten merged commit 35fd39f into develop Nov 29, 2024
43 of 45 checks passed
@causten causten deleted the enable_fp8e5m2fnuz_type branch November 29, 2024 15:54
shivadbhavsar pushed a commit that referenced this pull request Dec 18, 2024